Two-Stream 3D Convolutional Neural Network for Skeleton-Based Action Recognition

نویسندگان

  • Hong Liu
  • Juanhui Tu
  • Mengyuan Liu
چکیده

It remains a challenge to efficiently extract spatialtemporal information from skeleton sequences for 3D human action recognition. Although most recent action recognition methods are based on Recurrent Neural Networks which present outstanding performance, one of the shortcomings of these methods is the tendency to overemphasize the temporal information. Since 3D convolutional neural network(3D CNN) is a powerful tool to simultaneously learn features from both spatial and temporal dimensions through capturing the correlations between three-dimensional signals, this paper proposes a novel two-stream model using 3D CNN. To our best knowledge, this is the first application of 3D CNN in skeleton-based action recognition. Our method consists of three stages. First, skeleton joints are mapped into a 3D coordinate space and then encoding the spatial and temporal information, respectively. Second, 3D CNN models are seperately adopted to extract deep features from two streams. Third, to enhance the ability of deep features to capture global relationships, we extend every stream into multitemporal version. Extensive experiments on the SmartHome dataset and the large-scale NTU RGB-D dataset demonstrate that our method outperforms most of RNN-based methods, which verify the complementary property between spatial and temporal information and the robustness to noise.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Hand Gesture Recognition from RGB-D Data using 2D and 3D Convolutional Neural Networks: a comparative study

Despite considerable enhances in recognizing hand gestures from still images, there are still many challenges in the classification of hand gestures in videos. The latter comes with more challenges, including higher computational complexity and arduous task of representing temporal features. Hand movement dynamics, represented by temporal features, have to be extracted by analyzing the total fr...

متن کامل

Action Recognition with Visual Attention on Skeleton Images

Action recognition with 3D skeleton sequences is becoming popular due to its speed and robustness. The recently proposed Convolutional Neural Networks (CNN) based methods have shown good performance in learning spatio-temporal representations for skeleton sequences. Despite the good recognition accuracy achieved by previous CNN based methods, there exist two problems that potentially limit the ...

متن کامل

Exploiting deep residual networks for human action recognition from skeletal data

The computer vision community is currently focusing on solving action recognition problems in real videos, which contain thousands of samples with many challenges. In this process, Deep Convolutional Neural Networks (D-CNNs) have played a significant role in advancing the state-of-the-art in various vision-based action recognition systems. Recently, the introduction of residual connections in c...

متن کامل

Action Recognition Based on Joint Trajectory Maps with Convolutional Neural Networks

Convolutional Neural Networks (ConvNets) have recently shown promising performance in many computer vision tasks, especially image-based recognition. How to effectively apply ConvNets to sequence-based data is still an open problem. This paper proposes an effective yet simple method to represent spatio-temporal information carried in 3D skeleton sequences into three 2D images by encoding the jo...

متن کامل

DMM-Pyramid Based Deep Architectures for Action Recognition with Depth Cameras

We propose a method for training deep convolutional neural networks (CNNs) to recognize the human actions captured by depth cameras. The depth maps and 3D positions of skeleton joints tracked by depth camera like Kinect sensors open up new possibilities of dealing with recognition task. Current methods mostly build classifiers based on complex features computed from the depth data. As a deep mo...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:
  • CoRR

دوره abs/1705.08106  شماره 

صفحات  -

تاریخ انتشار 2017